Hierarchical acoustic modeling based on random-effects regression for automatic speech recognition

Authors

  • Yan Han
  • Lou Boves
Abstract

Recent research on human intelligence [1] suggests that the auditory system has a hierarchical structure, in which the lower levels store the individual properties and the upper levels store the group properties of utterances. However, most conventional automatic speech recognizers adopt a single-level model structure. In structure-based models, such as HMMs and parametric trajectory models, only the group properties of utterances are modeled. In template-based models, only the individual properties of utterances are exploited. In this paper, we propose a novel hierarchical acoustic model that simulates the human auditory hierarchy and explicitly addresses both the group and the individual properties of utterances. Furthermore, we develop two evaluation methods, namely bottom-up and top-down tests, to simulate the prediction-verification loops in human hearing. The model is evaluated on a TIMIT vowel classification task. The proposed hierarchical model significantly outperforms parametric trajectory models.

Similar resources

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Speech Recognition using Acoustic Landmarks and Binary Phonetic Feature Classifiers

In spite of decades of research, Automatic Speech Recognition (ASR) is far from reaching the goal of performance close to Human Speech Recognition (HSR). One of the reasons for the unsatisfactory performance of state-of-the-art ASR systems, which are based largely on Hidden Markov Models (HMMs), is the inferior acoustic modeling of low-level or phonetic-level linguistic information in the speech...

Continuous Hindi Speech Recognition using Monophone based Acoustic Modeling

Speech is a natural way of communication, and it provides an intuitive user interface to machines. However, the performance of automatic speech recognition (ASR) systems is far from perfect. The overall performance of any speech recognition system depends heavily on the acoustic modeling. Hence, generation of an accurate and robust acoustic model holds the key to satisfactory recognition perform...

Publication date: 2007